Minkowski-r Back-Propagation: Learning in Connectionist Models with Non-Euclidian Error Signals
Many connectionist learning models are implemented using gradient descent in a least squares error function of the output and teacher signal. The present model generalizes back-propagation by using Minkowski-r power metrics: for small r a "city-block" error metric is approximated, and for large r the "maximum" or "supremum" metric is approached. An implementation of Minkowski-r back-propagation is described. Different r values may be appropriate for reducing the effects of outliers (noise).
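The limiting behaviors described in the abstract can be sketched numerically. The following is an illustrative example (not code from the paper), taking the Minkowski-r error as E_r = Σ_i |y_i − t_i|^r, whose gradient with respect to each output is r·|y_i − t_i|^(r−1)·sign(y_i − t_i); r = 1 gives the city-block metric, r = 2 the usual least-squares error, and E_r^(1/r) approaches the largest single component error as r grows:

```python
def minkowski_r_error(output, target, r):
    """Minkowski-r error between an output vector and the teacher signal."""
    return sum(abs(y - t) ** r for y, t in zip(output, target))

def minkowski_r_gradient(output, target, r):
    """dE/dy_i = r * |y_i - t_i|**(r-1) * sign(y_i - t_i)."""
    grads = []
    for y, t in zip(output, target):
        d = y - t
        s = (d > 0) - (d < 0)  # sign of the residual
        grads.append(r * abs(d) ** (r - 1) * s)
    return grads

y = [0.9, 0.2, 0.4]
t = [1.0, 0.0, 0.0]
print(minkowski_r_error(y, t, 1))  # city-block: sum of absolute errors
print(minkowski_r_error(y, t, 2))  # least squares: sum of squared errors
# As r grows, the r-th root of the error approaches the supremum metric,
# i.e. the largest single component error (here 0.4):
print(minkowski_r_error(y, t, 20) ** (1 / 20))
```

For r = 2 the gradient reduces to 2(y − t), recovering the familiar least-squares error signal used in standard back-propagation.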
Stephen Jose Hanson and David J. Burr
It can be shown that neural-like networks containing a single hidden layer of nonlinear activation units can learn to do a piece-wise linear partitioning of a feature space [2]. One result of such a partitioning is a complex gradient surface on which decisions about new input stimuli will be made. The generalization, categorization and clustering properties of the network are therefore determined by this mapping of input stimuli to this gradient surface in the output space. This gradient surface is a function of the conditional probability distributions of the output vectors given the input feature vectors as well as a function of the error relating the teacher signal and output.
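Since the gradient surface depends on the error relating teacher signal and output, changing the error metric changes the learning rule only at the output-error term. A minimal sketch (my reading, not the paper's listing) of the output-unit delta under a Minkowski-r error, assuming sigmoid activation units and absorbing the constant factor r into the learning rate:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def output_delta(net, target, r):
    """Output-unit error signal for Minkowski-r back-propagation.

    Standard back-propagation uses delta = (t - y) * f'(net); the
    Minkowski-r version replaces (t - y) with
    |t - y|**(r-1) * sign(t - y), so r = 2 recovers the usual rule.
    """
    y = sigmoid(net)
    d = target - y
    s = (d > 0) - (d < 0)            # sign of the residual
    fprime = y * (1.0 - y)           # derivative of the sigmoid
    return abs(d) ** (r - 1) * s * fprime

# With r = 2 this matches the familiar (t - y) * y * (1 - y) rule:
print(output_delta(0.3, 1.0, 2))
```

Because only the output-error term changes, the hidden-layer back-propagation equations are unaffected; the modified deltas simply propagate backward as usual.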
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)